Bioinformatics on the Cloud Computing Platform Azure
Abstract
We discuss the applicability of the Microsoft cloud computing platform, Azure, to bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited in the public database ArrayExpress. Appendix S1 provides a walkthrough that demonstrates explicitly how Azure can be used to perform these analyses, and we offer a comparison with a local computation. We note that the Platform as a Service (PaaS) offering of Azure can present a steep learning curve for bioinformatics developers, who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and to explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development.
Similar resources
Performance Analysis of Vertex-centric Graph Algorithms on the Azure Cloud Platform
Finding key vertices in large graphs is an important problem in many applications such as social networks, bioinformatics, and distribution networks. Betweenness centrality is a popular algorithm for finding such vertices and has been studied extensively, yielding several parallel formulations suitable to supercomputers and clusters. In this paper we implement and study betweenness centrality i...
Cloud Computing for Comparative Genomics with Windows Azure Platform
Cloud computing services have emerged as a cost-effective alternative to cluster systems as the number of genomes and the computation power required to analyze them have increased in recent years. Here we introduce the Microsoft Azure platform with detailed execution steps and a cost comparison with Amazon Web Services.
An Efficient Bulk Synchronous Parallelized Scheduler for Bioinformatics Application on Public Cloud
Genomic sequence alignment of varied species is one of the most sought-after applications in bioinformatics. In the future, bioinformatics technologies are expected to produce terabytes of genomic data. Bioinformatics computation requires a supercomputer for sequence alignment, which involves huge cost. Parallelization is a way forward in computing sequence alignment with limited cost a...
Towards an MPI-like Framework for Azure Cloud Platform
Message passing interface (MPI) has been widely used for implementing parallel and distributed applications. The emergence of cloud computing offers a scalable, fault-tolerant, on-demand alternative to traditional on-premise clusters. In this thesis, we investigate the possibility of adopting the cloud platform as an alternative to conventional MPI-based solutions. We show that cloud platform c...
Windows Azure Platform: an Era for Cloud Computing
The Windows Azure platform is Microsoft's implementation of cloud computing. This paper gives a detailed introduction to the Windows Azure platform. Windows Azure provides resources and services for consumers. The next part describes the five main components of Windows Azure: Hardware is abstracted and exposed as compute resources. Physical storage is abstracted as storage resources and exposed throug...